Glottal spectral separation for parametric speech synthesis
Abstract
A key advantage of using a glottal source model in parametric speech synthesis is the parametric flexibility it offers for modeling and transforming aspects of voice quality and speaker identity. However, few studies have addressed how the glottal source affects the quality of synthetic speech. Here, we have developed the Glottal Spectral Separation (GSS) method, which separates the glottal source effects from the spectral envelope of the speech. This enables us to compare the LF-model with a simple impulse excitation, using the same spectral envelope to synthesize speech. The results of a perceptual evaluation showed that the LF-model clearly outperformed the impulse excitation. The GSS method was also used to successfully transform a modal voice into a breathy or tense voice by modifying the LF-parameters. The proposed technique could be used to improve the speech quality and source parametrization of HMM-based speech synthesizers, which use an impulse excitation.
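As a rough illustration of the separation idea in the abstract, the sketch below assumes that the glottal source effect can be removed by dividing the speech magnitude spectrum by the source magnitude spectrum, bin by bin. The abstract does not spell out the actual procedure, so this is an assumption. The function names, the toy spectra, and the `floor` parameter are all hypothetical and not taken from the paper.

```python
import math


def remove_source_from_spectrum(speech_mag, source_mag, floor=1e-8):
    """Divide the speech magnitude spectrum by the source magnitude spectrum.

    Assumption: separation is modeled as a per-bin division. `floor` guards
    against division by very small source values.
    """
    if len(speech_mag) != len(source_mag):
        raise ValueError("spectra must have the same number of bins")
    return [s / max(g, floor) for s, g in zip(speech_mag, source_mag)]


def apply_source_to_envelope(envelope_mag, source_mag):
    """Multiply an envelope by an excitation spectrum, bin by bin."""
    return [e * g for e, g in zip(envelope_mag, source_mag)]


# Toy data (hypothetical): a falling spectrum stands in for the glottal
# source, and a smooth positive curve stands in for the vocal tract.
n_bins = 8
source_tilt = [1.0 / (1 + k) for k in range(n_bins)]
tract = [1.0 + 0.5 * math.cos(math.pi * k / 4) for k in range(n_bins)]

# "Speech" spectrum = vocal tract shaped by the source.
speech = apply_source_to_envelope(tract, source_tilt)

# Separation step: recover an envelope free of the source tilt.
recovered = remove_source_from_spectrum(speech, source_tilt)

# The same envelope can then be excited either by the source spectrum
# or by an impulse, whose magnitude spectrum is flat.
with_source = apply_source_to_envelope(recovered, source_tilt)
with_impulse = apply_source_to_envelope(recovered, [1.0] * n_bins)
```

In this toy setting the recovered envelope matches the stand-in vocal tract, and the two excitations differ only in the spectral tilt they impose on the same envelope. That shared envelope is what makes the comparison described in the abstract possible.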
Similar resources
A Simple Continuous Excitation Model for Parametric Vocoding
We describe a continuous-pitch parametric vocoder suitable for speech coding and statistical text to speech synthesis. The spectral model is based on linear prediction. We show that glottal modelling techniques from recent literature can be cherry-picked to produce an excitation signal with properties known to be useful in the above application areas. We further show that the continuous pitch p...
Perceptual spectral matching utilizing mel-scale filterbanks for statistical parametric speech synthesis with glottal excitation vocoder
Physiologically-motivated modeling of the voice source in articulatory analysis/synthesis
This paper describes the implementation of a new parametric model of the glottal geometry aimed at improving male and female speech synthesis in the framework of articulatory analysis synthesis. The model represents glottal geometry in terms of inlet and outlet area waveforms and is controlled by parameters that are tightly coupled to physiology, such as vocal fold abduction. It is embedded in ...
Speech synthesis by structured segments, using temporal decomposition and a glottal excitation
Classical speech synthesis systems either concatenate diphone-like tabulated patterns or reconstruct speech parameters according to pre-defined rules. Both techniques show drawbacks: the former lacks flexibility while the latter is highly time-consuming to build. We propose an intermediate technique using structured segments: segmental units are still resorted to, but they are automatically analy...
Publication date: 2008